Taming big data: Applying the experimental method to naturalistic data sets
Similar resources
Taming Wild Big Data
Wild Big Data (WBD) is data that is hard to extract, understand, and use because of its heterogeneous nature and volume. It typically comes without a schema, is obtained from multiple sources, and poses a challenge for information extraction and integration. We describe a way of subduing WBD that uses techniques and resources that are popular for processing natural language text. The approach is ...
Big Data Quality: From Content to Context
Over the last 20 years, and particularly with the advent of Big Data and analytics, Data and Information Quality (DIQ) has remained a fast-growing research area. There are many views and streams in DIQ research, generally aiming at improving the effectiveness of decision making in organizations. Although much research has aimed at clarifying the role of BIG data...
Taming Biological Big Data with D4M
The growth of large, unstructured data sets is driving the development of new technologies for finding items of interest in these data. Because of the tremendous expansion of data from DNA sequencing, bioinformatics has become an active area of research in the supercomputing community [1, 2]. The Dynamic Distributed Dimensional Data Model (D4M) developed at Lincoln Laboratory, and availabl...
Applying Method Data Dependence
Concurrency control in object-based systems is a new area of research that has only just begun to be addressed. Recent work has proposed using object-level locking, but this may be unnecessarily restrictive. Data-level locking within an object increases concurrency but also introduces substantial lock overhead that makes the approach impractical. This paper proposes using a new approach applying...
Applying Stratosphere for Big Data Analytics
Analyzing big data sets as they occur in modern business and science applications requires query languages that allow for the specification of complex data processing tasks. Moreover, these ideally declarative query specifications have to be optimized, parallelized and scheduled for processing on massively parallel data processing platforms. This paper demonstrates the application of Stratosphe...
Journal
Journal title: Behavior Research Methods
Year: 2019
ISSN: 1554-3528
DOI: 10.3758/s13428-018-1185-6